
    Autobin: A predictive approach towards automatic binning using data splitting

    The concept of binning is known by many names: discretisation, classing, grouping and quantisation. It entails the mapping of continuous or categorical data into discrete bins. Binning is an important pre-processing step in most predictive models and is considered a basic data preparation step in building a credit scorecard. Credit scorecards are mathematical models which attempt to provide a quantitative estimate of the probability that a customer will display a defined behaviour (e.g. default) with respect to their current credit position with a lender. Among the practical advantages of binning are the removal of the effects of outliers and a way to handle missing values. Many binning methods exist, but they are often time consuming to carry out. We propose a new method, Autobin, that is based on data splitting and maximising a cross-validation form of the predicted log-likelihood. Autobin has the advantage of being nearly automatic and requires very little by way of tuning parameters. In a limited simulation study, Autobin was found to outperform its competitors.
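    The abstract does not give Autobin's exact algorithm, but the core idea it describes — choosing a binning by data splitting and a validated log-likelihood — can be sketched as follows. The equal-frequency candidate bins, smoothing constants and 70/30 split below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def validated_loglik(x, y, n_bins, split=0.7, seed=0):
    """Score a candidate bin count by the log-likelihood that
    training-split event rates assign to a held-out split.
    Illustrative only -- not the actual Autobin algorithm."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    cut = int(split * len(x))
    tr, va = idx[:cut], idx[cut:]
    # Equal-frequency bin boundaries estimated on the training split
    cuts = np.quantile(x[tr], np.linspace(0, 1, n_bins + 1))[1:-1]
    b_tr = np.digitize(x[tr], cuts)
    b_va = np.digitize(x[va], cuts)
    # Smoothed event rate per bin, estimated on the training split
    rates = np.array([(y[tr][b_tr == b].sum() + 0.5) /
                      ((b_tr == b).sum() + 1.0) for b in range(n_bins)])
    p = rates[b_va]
    return np.sum(y[va] * np.log(p) + (1 - y[va]) * np.log(1 - p))

# Choose the bin count that maximises the validated log-likelihood
rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = (rng.random(2000) < 1 / (1 + np.exp(-x))).astype(int)
best = max(range(2, 11), key=lambda k: validated_loglik(x, y, k))
```

    The smoothing in the per-bin rates keeps the held-out log-likelihood finite even when a bin contains only events or only non-events.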

    Advantages of using factorisation machines as a statistical modelling technique

    Factorisation machines originated in the machine learning literature and have gained popularity because of the high accuracy obtained in several prediction problems, in particular in the area of recommender systems. This article will provide a motivation for the use of factorisation machines, discuss their fundamentals, and provide examples of some applications and the possible gains from using factorisation machines as part of the statistician’s model-building toolkit. Data sets and existing software packages will be used to illustrate how factorisation machines may be fitted and in which contexts they are worth using.

    Algorithms for estimating the parameters of factorisation machines

    Since their introduction in 2010, factorisation machines have become a popular prediction technique amongst machine learners, who have applied the method with success in several data science challenges such as Kaggle competitions and the KDD Cup. Despite these successes, factorisation machines are not often considered as a modelling technique in business, partly because large companies prefer tried and tested software for model implementation. Popular modelling techniques for prediction problems, such as generalised linear models, neural networks, and classification and regression trees, have been implemented in commercial software such as SAS, which is widely used by banks, insurance, pharmaceutical and telecommunication companies. To popularise the use of factorisation machines in business, we implement algorithms for fitting factorisation machines in SAS. These algorithms minimise two loss functions, namely the weighted sum of squared errors and the weighted sum of absolute deviations, using coordinate descent and nonlinear programming procedures. Using a simulation study, the above-mentioned routines are tested in terms of accuracy and efficiency. The prediction power of factorisation machines is then illustrated by analysing two data sets.
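    The model being fitted is the standard second-order factorisation machine. As a rough illustration (in Python rather than the SAS procedures the abstract describes), the prediction function and one of the two loss functions mentioned, the weighted sum of squared errors, can be written as:

```python
import numpy as np

def fm_predict(X, w0, w, V):
    """Second-order factorisation machine:
    y(x) = w0 + <w, x> + sum_{i<j} <V_i, V_j> x_i x_j,
    using the O(kn) reformulation of the pairwise term."""
    linear = X @ w
    pairwise = 0.5 * np.sum((X @ V) ** 2 - (X ** 2) @ (V ** 2), axis=1)
    return w0 + linear + pairwise

def weighted_sse(y, y_hat, weights):
    """Weighted sum of squared errors."""
    return np.sum(weights * (y - y_hat) ** 2)

rng = np.random.default_rng(0)
X = rng.random((5, 3))          # 5 observations, 3 features
w0, w = 0.1, rng.normal(size=3)
V = rng.normal(size=(3, 2))     # k = 2 latent factors per feature
y_hat = fm_predict(X, w0, w, V)
```

    The reformulation of the pairwise interaction term avoids the explicit double loop over feature pairs, which is what makes coordinate-descent fitting of factorisation machines tractable for high-dimensional data.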

    The impact of PD-LGD correlation on expected loss and economic capital

    The Basel regulatory credit risk rules for expected losses require banks to use downturn loss given default (LGD) estimates because the correlation between the probability of default (PD) and LGD is not captured, even though this correlation has been repeatedly demonstrated by empirical research. A model is examined which captures this correlation using empirically observed default frequencies and simulated LGD and default data of a loan portfolio. The model is tested under various conditions dictated by input parameters. Having established an estimate of the impact on expected losses, it is suggested that the model be calibrated using banks' own loss data to compensate for the omission of correlation dependence. Because the model relies on observed default frequencies, it could adapt in real time, forcing provisions to be allocated dynamically.
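    The central point, that ignoring PD-LGD correlation understates expected loss, can be sketched with a toy Monte Carlo in which a single systematic factor drives both default and loss severity. All parameter values below are illustrative, not taken from the paper:

```python
import numpy as np

# Toy Monte Carlo: expected loss rises when PD and LGD move together.
rng = np.random.default_rng(42)
n = 200_000
# One common systematic factor drives both default risk and severity
z = rng.normal(size=n)
pd_sim = 1 / (1 + np.exp(-(-3.0 + 0.8 * z)))              # per-loan PD
lgd_sim = np.clip(0.4 + 0.15 * z + 0.05 * rng.normal(size=n), 0, 1)
defaults = rng.random(n) < pd_sim
# E[1{default} * LGD] vs the product of marginal means
loss_correlated = np.mean(defaults * lgd_sim)
loss_independent = np.mean(defaults) * np.mean(lgd_sim)
```

    Because both quantities increase in the common factor, the covariance term is positive, so the independence assumption systematically understates the expected loss.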

    A critical review of the Basel margin of conservatism requirement in a retail credit context

    The Basel II accord (2006) includes guidelines to financial institutions for the estimation of regulatory capital (RC) for retail credit risk. Under the advanced Internal Ratings Based (IRB) approach, the formula suggested for calculating RC is based on the Asymptotic Single Risk Factor (ASRF) model, which assumes that a borrower will default if the value of its assets falls below the value of its debts. The primary inputs needed in this formula are estimates of the probability of default (PD), loss given default (LGD) and exposure at default (EAD). Banks for which usage of the advanced IRB approach has been approved usually obtain these estimates from complex models developed in-house. Basel II recognises that estimates of PDs, LGDs and EADs are likely to involve unpredictable errors, and then states that, in order to avoid over-optimism, a bank must add to its estimates a margin of conservatism (MoC) that is related to the likely range of errors. Basel II also requires several other measures of conservatism to be incorporated. These conservatism requirements lead to confusion among banks and regulators as to what exactly is required as far as a margin of conservatism is concerned. In this paper, we discuss the ASRF model and its shortcomings, as well as the Basel II conservatism requirements. We study the MoC concept and review possible approaches for its implementation. Our overall objective is to highlight to bank practitioners and regulators certain issues regarding shortcomings inherent in a pervasively used model, and to offer a less confusing interpretation of the MoC concept.
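    The retail IRB capital formula implied by the ASRF model is widely published; a minimal sketch of it (without the maturity adjustment used for corporate exposures, and with an illustrative asset correlation) is:

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf
PHI_INV = NormalDist().inv_cdf

def irb_capital(pd_est, lgd, rho):
    """Capital requirement K per unit of EAD under the ASRF model
    (retail form): the 99.9th-percentile conditional loss less the
    expected loss LGD * PD."""
    cond_pd = PHI((PHI_INV(pd_est) + sqrt(rho) * PHI_INV(0.999))
                  / sqrt(1 - rho))
    return lgd * cond_pd - lgd * pd_est

# Residential mortgages use a fixed asset correlation of 0.15
k = irb_capital(pd_est=0.02, lgd=0.45, rho=0.15)
```

    Since the PD, LGD and EAD inputs enter this formula directly, any estimation error propagates straight into RC, which is why Basel II asks for a margin of conservatism on the inputs themselves.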

    A proposed quantitative credit-rating methodology for South African provincial departments

    The development of subnational credit-rating methodologies affords benefits for subnationals, the sovereign and its citizens. Trusted credit ratings facilitate access to financial markets, and above-average ratings allow for the negotiation of better collateral and guarantee agreements, as well as for funding of, for example, infrastructure projects at superior (lower) interest rates. This paper develops the quantitative section of a credit-rating methodology for South African subnationals. The unique characteristics of South African data, their assembly, and the selection of dependent and independent variables for the chosen linear-regression model are discussed. The methodology is then applied to the provincial Department of Health using linear regression modelling.
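    The quantitative step is ordinary linear regression. As a generic sketch only — the regressor names below are hypothetical stand-ins, not the paper's actual variables:

```python
import numpy as np

# Fit a credit-score proxy on synthetic subnational indicators via OLS.
rng = np.random.default_rng(7)
n = 60
X = np.column_stack([
    np.ones(n),     # intercept
    rng.random(n),  # hypothetical indicator, e.g. own-revenue ratio
    rng.random(n),  # hypothetical indicator, e.g. spending-to-budget ratio
])
beta_true = np.array([2.0, 1.5, -0.8])
y = X @ beta_true + 0.1 * rng.normal(size=n)  # synthetic score proxy
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

    In practice the variable-selection step the abstract describes would precede this fit, since the usefulness of the rating hinges on which fiscal indicators enter the design matrix.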

    The good, the bad and the ugly of South African fatal road accidents

    We reflect on the good, the bad and the ugly of the fatal accidents occurring on South Africa’s roads. The cost of human lives indisputably equates to ‘the ugly’, and the economic cost of accidents is associated with ‘the bad’. ‘The good’ relates to the reduction of both these costs that may result from the entrance of self-driving cars into the South African market, as well as from awareness campaigns like the Arrive Alive National Road Safety Strategy. The general contribution of this paper is to raise awareness of the effects of accidents, more specifically fatal accidents. Current trends in terms of human factors as well as road and environmental factors involved in fatal accidents on South African roads are summarised. This paper also serves as a preliminary investigation into possible factors influencing these accidents, which ought to be of interest to a very broad readership, more specifically those focusing on risk analysis, and certainly to any citizen of South Africa.
    Significance: Awareness is raised of the effects of fatal accidents on South African roads. Current trends in terms of human factors as well as road and environmental factors in road accidents are reflected upon. The potential future effect of self-driving cars is explored.

    The Changing Landscape of Financial Credit Risk Models

    The landscape of financial credit risk models is changing rapidly. This study takes a brief look into the future of predictive modelling by considering some factors that influence financial credit risk modelling. The first factor is machine learning. As machine learning expands, it becomes necessary to understand how these techniques work and how they can be applied. The second factor is financial crises. Whereas predictive models view the future as a reflection of the past, financial crises can violate this assumption. This creates a new field of research on how to adjust predictive models to incorporate forward-looking conditions, including future expected financial crises. The third factor is the impact of financial technology (Fintech) on the future of predictive modelling. Fintech creates new applications for predictive modelling and therefore broadens the possibilities in the financial predictive modelling field. This changing landscape creates some challenges but also a wealth of opportunities. One way of exploiting these opportunities and managing the associated risks is through industry collaboration: academics should join hands with industry to create industry-focused training and industry-focused research. In summary, this study makes three novel contributions to the field of financial credit risk models. Firstly, it investigates and provides a comprehensive discussion of three factors that contribute to rapid changes in the credit risk predictive modelling landscape. Secondly, it presents a unique discussion of the challenges and opportunities arising from these factors. Lastly, it proposes an innovative solution, namely collaboration between academic and industry partners, to manage the challenges effectively and take advantage of the opportunities for mutual benefit.